Deep Collaborative Learning for Visual Recognition
نویسندگان
چکیده
Deep neural networks are playing an important role in state-of-the-art visual recognition. To represent high-level visual concepts, modern networks are equipped with large convolutional layers, which use a large number of filters and contribute significantly to model complexity. For example, more than half of the weights of AlexNet are stored in the first fully-connected layer (4,096 filters). We formulate the function of a convolutional layer as learning a large visual vocabulary, and propose an alternative way, namely Deep Collaborative Learning (DCL), to reduce the computational complexity. We replace a convolutional layer with a two-stage DCL module, in which we first construct a couple of smaller convolutional layers individually, and then fuse them at each spatial position to consider feature co-occurrence. In mathematics, DCL can be explained as an efficient way of learning compositional visual concepts, in which the vocabulary size increases exponentially while the model complexity only increases linearly. We evaluate DCL on a wide range of visual recognition tasks, including a series of multi-digit number classification datasets, and some generic image classification datasets such as SVHN, CIFAR and ILSVRC2012. We apply DCL to several state-of-the-art network structures, improving the recognition accuracy meanwhile reducing the number of parameters (16.82% fewer in AlexNet).
منابع مشابه
Deep Collaborative Weight-based Classification
One of the biggest problems in deep learning is its difficulty to retain consistent robustness when transferring the model trained on one dataset to another dataset. To conquer the problem, deep transfer learning was implemented to execute various vision tasks by using a pre-trained deep model in a diverse dataset. However, the robustness was often far from state-of-the-art. We propose a collab...
متن کاملRecognition of Visual Events using Spatio-Temporal Information of the Video Signal
Recognition of visual events as a video analysis task has become popular in machine learning community. While the traditional approaches for detection of video events have been used for a long time, the recently evolved deep learning based methods have revolutionized this area. They have enabled event recognition systems to achieve detection rates which were not reachable by traditional approac...
متن کاملOpen-Domain Audio-Visual Speech Recognition: A Deep Learning Approach
Automatic speech recognition (ASR) on video data naturally has access to two modalities: audio and video. In previous work, audio-visual ASR, which leverages visual features to help ASR, has been explored on restricted domains of videos. This paper aims to extend this idea to open-domain videos, for example videos uploaded to YouTube. We achieve this by adopting a unified deep learning approach...
متن کاملCombining pattern recognition and deep-learning-based algorithms to automatically detect commercial quadcopters using audio signals (Research Article)
Commercial quadcopters with many private, commercial, and public sector applications are a rapidly advancing technology. Currently, there is no guarantee to facilitate the safe operation of these devices in the community. Three different automatic commercial quadcopters identification methods are presented in this paper. Among these three techniques, two are based on deep neural networks in whi...
متن کاملA Grouping Hotel Recommender System Based on Deep Learning and Sentiment Analysis
Recommender systems are important tools for users to identify their preferred items and for businesses to improve their products and services. In recent years, the use of online services for selection and reservation of hotels have witnessed a booming growth. Customer’ reviews have replaced the word of mouth marketing, but searching hotels based on user priorities is more time-consuming. This s...
متن کاملذخیره در منابع من
با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید
عنوان ژورنال:
- CoRR
دوره abs/1703.01229 شماره
صفحات -
تاریخ انتشار 2017